Overview of an Assembler
Updated 22-Oct-2024
------------------------

Input to the assembler
  o A line may be blank
  o A line may be just a comment
  o A line may be just a label
  o A line may contain a single instruction
  o A line may contain a pseudo instruction
  o A line may contain an assembler directive

  o Instructions
      o Optional label
      o Opcode
      o Operands
          o If there are multiple operands, they are usually separated
            by a comma
      o Optional comment: delimited by a pound-sign, everything after
        on that line is a comment
      o White space (spaces, tabs) must separate:
          o the opcode from any operands
      o White space (spaces, tabs) may appear in a variety of places:
          o between the label and the opcode
          o before the opcode (if there is no label)
          o between an operand and a following comma
          o between the comma following an operand and a following
            operand
          o before a comment delimiter
      o Often white space is required before the opcode of an
        instruction, if that input line does not contain a label --
        even if not required, it is traditional to use tabs to line up
        the beginning of all opcodes, the beginning of the operand
        field, and the beginning of comments that are on instruction
        lines

  o Labels
      o Appear before the opcode and are followed by a colon
      o Forward references
          o Need to maintain a symbol table for labels
          o A reference may be made to a symbol that is not yet defined
          o Therefore, the location of that label is not yet known
          o Solutions:
              o Two passes
              o Maintain a list of where forward references are made
                and fix them later when the label is defined
                  o May be difficult when labels are used in
                    complicated expressions
          o Two passes is usually the easier solution to implement
              o First pass through the source assembly file simply
                updates the location counter appropriately and enters
                definitions of labels into a symbol table
              o Second pass through the source assembly file generates
                the code and data definition
                  o The generated code and data may be emitted as each
                    instruction or data definition is seen or stored in
                    an array and emitted after the entire source file
                    has been fully processed

  o Pseudo instructions
      o Look like instructions, but created by the assembler
      o May be a single instruction or more than one instruction
      o May use one (or more) registers reserved for the assembler that
        are not seen by the assembly language programmer

  o Directives
      o To reserve space
            .space            # skip over specified number of bytes
      o To reserve space and initialize it to particular values
            .word ,, ... ,    # .word also forces alignment to a word boundary
            .half ,, ... ,    # .half also forces alignment to a halfword boundary
            .byte ,, ... ,
            .ascii            # string in double quotes
            .asciiz           # null terminated string
      o To set the location counter
            .org
      o To set a label to a specified value
            label: .equ
      o To force alignment
            .align            # align next item on a 2^n byte boundary

  o Arithmetic expressions
      o decimal numbers
      o hexadecimal numbers
      o octal numbers
      o some C-like operators
      o labels can be used as the location counter where they are
        defined
      o ASCII character constants (in single quotes)
      o $ by itself is the current value of the location counter

The output of the assembler should be a MIF (Memory Initialization File)
  o See https://cscie93.dce.harvard.edu/fall2024/def_mif.htm
  o Note that the value for the DEPTH parameter must be 16384, the
    WIDTH parameter must be 16 for the DE2-70
  o Note that the value for the DEPTH parameter must be 32768, the
    WIDTH parameter must be 16 for the DE2-115

The model for our memory system:
  o See https://cscie93.dce.harvard.edu/fall2024/io_interface.txt

The description of the processor-memory interface:
  o See https://cscie93.dce.harvard.edu/fall2024/processor_memory_interface.txt
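
Example: splitting an input line into its fields

The rules for instruction lines given above (optional label followed by a colon, opcode, comma-separated operands, optional comment after a pound-sign) can be sketched in Python as follows. This is only an illustration, not a required design: the function name parse_line and the register names in the usage note are hypothetical, and the sketch assumes that neither '#' nor ':' appears inside a string or character constant.

```python
def parse_line(line):
    """Split one source line into (label, opcode, operands, comment)."""
    # Everything after the first pound-sign on the line is a comment.
    # Assumption: '#' never appears inside a quoted string or char constant.
    code, sep, comment = line.partition("#")
    comment = comment.strip() if sep else None

    label = None
    code = code.strip()
    # A label appears before the opcode and is followed by a colon.
    # Assumption: ':' is used only to terminate a label.
    if ":" in code:
        head, _, rest = code.partition(":")
        label, code = head.strip(), rest.strip()

    if not code:
        # Blank line, comment-only line, or label-only line.
        return label, None, [], comment

    # White space must separate the opcode from any operands.
    parts = code.split(None, 1)
    opcode = parts[0]
    operands = []
    if len(parts) == 2:
        # Operands are separated by commas; white space around a comma
        # is permitted and is discarded here.
        operands = [op.strip() for op in parts[1].split(",")]
    return label, opcode, operands, comment
```

For instance, parse_line("loop:\tadd r1, r2 ,r3  # sum") yields the label "loop", the opcode "add", the operands ["r1", "r2", "r3"] and the comment "sum". A blank line yields no label, no opcode, no operands and no comment.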
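
Example: resolving forward references with two passes

The two-pass solution described under "Forward references" can be sketched as below. This is a toy model under stated assumptions, not the assembler for any particular instruction set: each source line is already split into a hypothetical (label, opcode, operand) tuple, every instruction is assumed to occupy exactly one word, and the opcode names are made up for illustration.

```python
def first_pass(lines):
    """Advance the location counter and record each label's address."""
    symbols = {}
    location = 0  # location counter, in words (toy assumption)
    for label, opcode, _operand in lines:
        if label is not None:
            if label in symbols:
                raise ValueError(f"duplicate label: {label}")
            symbols[label] = location
        if opcode is not None:
            location += 1  # assumption: every instruction is one word
    return symbols


def second_pass(lines, symbols):
    """Generate output; every label is now present in the symbol table."""
    output = []
    for _label, opcode, operand in lines:
        if opcode is None:
            continue  # a label-only line generates nothing
        if operand is None or isinstance(operand, int):
            value = operand
        elif operand in symbols:
            value = symbols[operand]
        else:
            raise ValueError(f"undefined symbol: {operand}")
        output.append((opcode, value))
    return output


# Hypothetical program: 'done' is referenced before it is defined.
program = [
    (None, "jump", "done"),   # forward reference
    ("loop", "add", 1),
    (None, "jump", "loop"),   # backward reference
    ("done", "halt", None),
]
symbols = first_pass(program)
code = second_pass(program, symbols)
```

Here the first pass places "loop" at location 1 and "done" at location 3, so the second pass can replace the forward reference in the first line with 3 without any later fix-up. The sketch stores the generated output in a list and returns it at the end, which corresponds to the "stored in an array and emitted after the entire source file has been fully processed" option.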
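
Example: advancing the location counter for .align

The comment on .align says the next item is placed on a 2^n byte boundary. A minimal sketch of that arithmetic follows, assuming the location counter is measured in bytes and that n is the operand given to the directive; the function name align_up is hypothetical.

```python
def align_up(location, n):
    """Return the smallest multiple of 2**n that is >= location."""
    boundary = 1 << n           # 2^n bytes
    mask = boundary - 1         # low n bits set
    # Adding the mask carries into the next boundary unless the value is
    # already aligned; clearing the low n bits then rounds down onto it.
    return (location + mask) & ~mask
```

With n = 2, a location counter of 5 advances to 8, while a location counter already at 8 is left unchanged. The same calculation with n = 1 would give the halfword alignment mentioned for .half, if a halfword is taken to be two bytes.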
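
Example: writing the generated words as a MIF

The DEPTH and WIDTH requirements above can be illustrated with a short sketch that formats a list of 16-bit words as MIF text. It assumes the commonly used Quartus MIF layout (DEPTH, WIDTH, ADDRESS_RADIX and DATA_RADIX headers followed by a CONTENT BEGIN ... END; block) with both radixes set to HEX; the MIF definition linked above is the authority if it differs. The function name format_mif and the choice to zero-fill unused memory are assumptions made for illustration.

```python
def format_mif(words, depth=16384, width=16):
    """Return MIF text for `words`, zero-filling the rest of memory.

    The default depth of 16384 matches the DE2-70 value given above;
    pass depth=32768 for the DE2-115.
    """
    if len(words) > depth:
        raise ValueError("program does not fit in memory")
    data_digits = (width + 3) // 4            # hex digits per data word
    addr_digits = len(format(depth - 1, "X"))  # hex digits per address
    mask = (1 << width) - 1                   # keeps negatives in range

    lines = [
        f"DEPTH = {depth};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT",
        "BEGIN",
    ]
    for addr, word in enumerate(words):
        lines.append(f"{addr:0{addr_digits}X} : {word & mask:0{data_digits}X};")

    # Assumption: unused locations are filled with zero.
    first_free, last = len(words), depth - 1
    zero = f"{0:0{data_digits}X}"
    if first_free == last:
        lines.append(f"{last:0{addr_digits}X} : {zero};")
    elif first_free < last:
        lines.append(
            f"[{first_free:0{addr_digits}X}..{last:0{addr_digits}X}] : {zero};"
        )
    lines.append("END;")
    return "\n".join(lines) + "\n"
```

Calling format_mif([0x1234, 0xFFFF]) produces a header with DEPTH = 16384 and WIDTH = 16, two data lines for addresses 0000 and 0001, and a single range line covering 0002 through 3FFF. Masking each word with the width means a negative value such as -1 is written in two's complement form as FFFF.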